Analyzing historical and future acute neurosurgical demand using an AI-enabled predictive dashboard

Characterizing acute service demand is critical for neurosurgery and other emergency-dominant specialties in order to dynamically distribute resources and ensure timely access to treatment. This is especially important in the post-Covid-19 pandemic period, when healthcare centers are grappling with a record backlog of pending surgical procedures and rising acute referral numbers. Healthcare dashboards are well-placed to analyze these data, making key information about service and clinical outcomes available to staff in an easy-to-understand format. However, they typically provide insights based on inference rather than prediction, limiting their operational utility. Using a novel AI-enabled predictive dashboard, we retrospectively analyzed and prospectively forecasted acute neurosurgical referrals, based on 10,033 referrals made to a large-volume tertiary neurosciences center in London, U.K., from the start of the Covid-19 pandemic lockdown period until October 2021. As anticipated, weekly referral volumes significantly increased during this period, largely owing to an increase in spinal referrals (p < 0.05). Applying validated time-series forecasting methods, we found that referrals were projected to increase beyond this time-point, with Prophet demonstrating the best test and computational performance. Using a mixed-methods approach, we determined that a dashboard approach was usable, feasible, and acceptable among key stakeholders.


Data pre-processing
Following anonymisation, referral data were loaded into a pandas DataFrame. Redundant columns, duplicates and erroneous entries were removed, and all dates and times were converted to Python datetime data types for further manipulation. Specialist working diagnoses are assigned by the on-call neurosurgical registrar when receiving the referral, chosen from a total of 138 different options. The diagnosis is based on the information received at the point of referral and may be modified as further information is shared or after senior review. Specialist diagnoses were aggregated into 13 primary diagnostic categories: brain tumour, cauda equina syndrome, congenital, subdural haematoma, cranial trauma, degenerative spine, hydrocephalus, infection, spinal trauma, stroke, neurovascular and 'not neurosurgical' (Supplementary Appendix).
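The pre-processing steps described above might look roughly like the following minimal sketch. The column names, the toy rows and the three-entry diagnosis lookup are all hypothetical and stand in for the real export and the full 138-option mapping, which are not reproduced here.

```python
import pandas as pd

# Hypothetical raw export; column names and rows are illustrative assumptions.
raw = pd.DataFrame({
    "referral_id": [1, 2, 2, 3, 4],
    "referred_at": ["2020-03-23 08:15", "2020-03-24 19:40", "2020-03-24 19:40",
                    "not recorded", "2020-04-02 02:05"],
    "working_diagnosis": ["Glioma", "Acute subdural haematoma",
                          "Acute subdural haematoma", "Glioma",
                          "Lumbar disc prolapse"],
    "export_batch": ["a", "a", "a", "a", "a"],  # a redundant column
})

# Illustrative subset of a specialist-diagnosis -> primary-category lookup.
CATEGORY_MAP = {
    "Glioma": "brain tumour",
    "Acute subdural haematoma": "subdural haematoma",
    "Lumbar disc prolapse": "degenerative spine",
}

def preprocess(df):
    """Drop redundant columns and duplicates, parse dates, aggregate diagnoses."""
    out = df.drop(columns=["export_batch"]).drop_duplicates()
    # Unparseable timestamps become NaT and are then dropped as erroneous entries.
    out["referred_at"] = pd.to_datetime(out["referred_at"], errors="coerce")
    out = out.dropna(subset=["referred_at"])
    # Map each specialist working diagnosis to its primary diagnostic category.
    out["category"] = out["working_diagnosis"].map(CATEGORY_MAP)
    return out.reset_index(drop=True)

clean = preprocess(raw)
```

Coercing unparseable timestamps to NaT before dropping them is one way to treat erroneous entries; the criteria actually used in the study may have been different.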

Implementation of time-series forecasting models
Three forecasting algorithms were trialled in this work: an automated pipeline which combined Seasonal and Trend decomposition using Loess (STL) with an automatic Autoregressive Integrated Moving Average (Auto-ARIMA) model, a Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) network, and Prophet. In this section we describe how each model was implemented. We performed an exploratory analysis of the time-series using auto-correlation and partial auto-correlation plots in combination with augmented Dickey-Fuller testing. This analysis was used to determine the degree of stationarity in the data, to assist in defining initial parameters for seasonal decomposition, and to set the upper and lower parameter limits for the auto-ARIMA grid search.

Usability, acceptability and feasibility
This study employed a mixed-methods design to assess dashboard usability, acceptability and feasibility. Participants were recruited from the local neurosurgical centre through mailing lists and were included if they had adequate experience (> 6 months) of using the electronic referral system. Participants were excluded if they were aware of the development of the dashboard.
In each testing session, participants were given a demonstration of the dashboard's capabilities (~ 10 minutes). To simulate a typical service evaluation, participants were shown how to use features to audit a particular diagnostic category or time-period. Using a think-aloud protocol, participants were then invited to explore the functions of the dashboard independently, after which they completed an electronic questionnaire that incorporated three validated instruments, adapted for use: the System Usability Scale (SUS), the Acceptability of Intervention Measure (AIM) and the Feasibility of Intervention Measure (FIM). The SUS asks participants to respond to a set of 10 statements using a 5-point Likert scale, with a composite score above 70 defined as "good" usability.
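For reference, the composite SUS score can be computed as in the sketch below, which follows the standard SUS scoring convention: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the total is multiplied by 2.5 to give a 0-100 score. The function name and the example responses are illustrative.

```python
def sus_score(responses):
    """Return the 0-100 SUS score for one participant's ten 1-5 responses."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs ten responses, each an integer from 1 to 5")
    total = 0
    for item_number, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if item_number % 2 == 1 else (5 - r)
    return total * 2.5

# Hypothetical participant: neutral on every statement.
neutral = sus_score([3] * 10)
```

A participant's score can then be compared against the threshold of 70 used here to define "good" usability.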
In each of the AIM and FIM scales, participants were presented with 4 statements in reference to the 'intervention' (dashboard) and asked to rate these on a 5-point Likert scale. These statements have been previously assessed for substantive and discriminant content validity [3]. Two white-box questions were also incorporated into the questionnaire: "Which aspects or features of the dashboard did you find useful?" and "Do you have any suggestions for improving the dashboard?". The full questionnaire is provided in the Supplementary Appendix.

Web application and synthetic data set
A trial version of the dashboard was hosted using Heroku (www.heroku.com), an online service allowing developers to deploy, manage and scale applications. A synthetic data set was created by taking the original anonymised data set and scrambling demographic and clinical variables, while keeping frequency of aggregate diagnostic classes and outcomes the same. Referral locations were shuffled and replaced with names and locations of English Premier League football stadiums to preserve referral site anonymity.
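One way to build such a synthetic set is sketched below: each column is permuted independently, which breaks the link between values within a row while leaving each column's frequencies unchanged, and the referral sites are then replaced through a shuffled one-to-one mapping. This is an assumed approach for illustration; the column names, toy rows, helper names and the short alias list are hypothetical.

```python
import numpy as np
import pandas as pd

def scramble(df, columns, seed=0):
    """Independently permute each listed column, preserving its value counts."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    for col in columns:
        out[col] = rng.permutation(out[col].to_numpy())
    return out

def pseudonymise_sites(df, site_col, aliases, seed=0):
    """Replace each real site with an alias via a shuffled one-to-one mapping."""
    sites = sorted(df[site_col].unique())
    if len(aliases) < len(sites):
        raise ValueError("need at least one alias per referral site")
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(aliases).tolist()
    mapping = dict(zip(sites, shuffled))
    out = df.copy()
    out[site_col] = out[site_col].map(mapping)
    return out

# Toy stand-in for the anonymised data set (illustrative values only).
original = pd.DataFrame({
    "age": [34, 71, 58, 45, 66, 29],
    "category": ["stroke", "brain tumour", "stroke", "infection", "stroke", "brain tumour"],
    "outcome": ["accepted", "advice", "advice", "accepted", "advice", "advice"],
    "site": ["Hospital A", "Hospital B", "Hospital A", "Hospital C", "Hospital B", "Hospital A"],
})
ALIASES = ["Anfield", "Old Trafford", "Emirates Stadium", "Villa Park"]

synthetic = scramble(original, ["age", "category", "outcome", "site"])
synthetic = pseudonymise_sites(synthetic, "site", ALIASES)
```

Because the columns are shuffled independently, relationships between variables (for example between diagnosis and outcome) are not preserved; only the per-column frequencies are.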